internal memory
Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Robust Response Generation in the Wild
Wang, Jiatai, Xu, Zhiwei, Jin, Di, Yang, Xuewen, Li, Tao
The proliferation of large language models (LLMs) has significantly advanced intelligent systems. Unfortunately, LLMs often face knowledge conflicts between internal memory and retrieved external information, arising from misinformation, biases, or outdated knowledge. These conflicts undermine response reliability and introduce uncertainty in decision-making. In this work, we analyze how LLMs navigate knowledge conflicts from an information-theoretic perspective and reveal that when conflicting and supplementary information exhibit significant differences, LLMs confidently resolve their preferences and alleviate the uncertainty during their response generation. When this difference is ambiguous, LLMs experience considerable uncertainty about their generation. Based on this insight, we propose Swin-VIB, a novel framework that integrates a pipeline of variational information bottleneck models to adapt the retrieved information difference, facilitating robust response generation of LLMs even in conflicting contexts. Extensive experiments confirm our theoretical analysis and demonstrate the performance of Swin-VIB. Notably, Swin-VIB outperforms all competitive baselines in terms of the accuracy of the multiple-choice task, while improving the EM values in the open-ended QA task by at least 11.14%.
- Asia > China > Tianjin Province > Tianjin (0.04)
- Asia > Mongolia (0.04)
- Asia > China > Inner Mongolia (0.04)
- (3 more...)
On Policy Reuse: An Expressive Language for Representing and Executing General Policies that Call Other Policies
Bonet, Blai, Drexler, Dominik, Geffner, Hector
Recently, a simple but powerful language for expressing and learning general policies and problem decompositions (sketches) has been introduced in terms of rules defined over a set of Boolean and numerical features. In this work, we consider three extensions of this language aimed at making policies and sketches more flexible and reusable: internal memory states, as in finite state controllers; indexical features, whose values are a function of the state and a number of internal registers that can be loaded with objects; and modules that wrap up policies and sketches and allow them to call each other by passing parameters. In addition, unlike general policies that select state transitions rather than ground actions, the new language allows for the selection of such actions. The expressive power of the resulting language for policies and sketches is illustrated through a number of examples.
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.34)
Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
Jin, Zhuoran, Cao, Pengfei, Yuan, Hongbang, Chen, Yubo, Xu, Jiexin, Li, Huaijun, Jiang, Xiaojian, Liu, Kang, Zhao, Jun
Recently, retrieval augmentation and tool augmentation have demonstrated a remarkable capability to expand the internal memory boundaries of language models (LMs) by providing external context. However, internal memory and external context inevitably clash, leading to knowledge conflicts within LMs. In this paper, we aim to interpret the mechanism of knowledge conflicts through the lens of information flow, and then mitigate conflicts by precise interventions at the pivotal point. We find there are some attention heads with opposite effects in the later layers, where memory heads can recall knowledge from internal memory, and context heads can retrieve knowledge from external context. Moreover, we reveal that the pivotal point at which knowledge conflicts emerge in LMs is the integration of inconsistent information flows by memory heads and context heads. Inspired by the insights, we propose a novel method called Pruning Head via PatH PatcHing (PH3), which can efficiently mitigate knowledge conflicts by pruning conflicting attention heads without updating model parameters. PH3 can flexibly control eight LMs to use internal memory ($\uparrow$ 44.0%) or external context ($\uparrow$ 38.5%). Moreover, PH3 can also improve the performance of LMs on open-domain QA tasks. We also conduct extensive experiments to demonstrate the cross-model, cross-relation, and cross-format generalization of our method.
- Europe > France (0.05)
- Asia > Singapore (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- (4 more...)
Tug-of-War Between Knowledge: Exploring and Resolving Knowledge Conflicts in Retrieval-Augmented Language Models
Jin, Zhuoran, Cao, Pengfei, Chen, Yubo, Liu, Kang, Jiang, Xiaojian, Xu, Jiexin, Li, Qiuxia, Zhao, Jun
Retrieval-augmented language models (RALMs) have demonstrated significant potential in refining and expanding their internal memory by retrieving evidence from external sources. However, RALMs will inevitably encounter knowledge conflicts when integrating their internal memory with external sources. Knowledge conflicts can ensnare RALMs in a tug-of-war between knowledge, limiting their practical applicability. In this paper, we focus on exploring and resolving knowledge conflicts in RALMs. First, we present an evaluation framework for assessing knowledge conflicts across various dimensions. Then, we investigate the behavior and preference of RALMs from the following two perspectives: (1) Conflicts between internal memory and external sources: We find that stronger RALMs emerge with the Dunning-Kruger effect, persistently favoring their faulty internal memory even when correct evidence is provided. Besides, RALMs exhibit an availability bias towards common knowledge; (2) Conflicts between truthful, irrelevant and misleading evidence: We reveal that RALMs follow the principle of majority rule, leaning towards placing trust in evidence that appears more frequently. Moreover, we find that RALMs exhibit confirmation bias, and are more willing to choose evidence that is consistent with their internal memory. To solve the challenge of knowledge conflicts, we propose a method called Conflict-Disentangle Contrastive Decoding (CD2) to better calibrate the model's confidence. Experimental results demonstrate that our CD2 can effectively resolve knowledge conflicts in RALMs.
- North America > Canada > Ontario > Toronto (0.05)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Asia > Middle East > Jordan (0.04)
- (3 more...)
- Personal > Honors (0.49)
- Research Report > New Finding (0.34)
The R-algebra of Quasiknowledge and Convex Optimization
This article develops a convex description of a classical or quantum learner's or agent's state of knowledge about its environment, presented as a convex subset of a commutative R-algebra. With caveats, this leads to a generalization of certain semidefinite programs in quantum information (such as those describing the universal query algorithm dual to the quantum adversary bound, related to optimal learning or control of the environment) to the classical and faulty-quantum setting, which would not be possible with a naive description via joint probability distributions over environment and internal memory. More philosophically, it also makes an interpretation of the set of reduced density matrices as "states of knowledge" of an observer of its environment, related to these techniques, more explicit. As another example, I describe and solve a formal differential equation of states of knowledge in that algebra, where an agent obtains experimental data in a Poissonian process, and its state of knowledge evolves as an exponential power series. However, this framework currently lacks impressive applications, and I post it in part to solicit feedback and collaboration on those. In particular, it may be possible to develop it into a new framework for the design of experiments, e.g. the problem of finding maximally informative questions to ask human labelers or the environment in machine-learning problems. The parts of the article not related to quantum information don't assume knowledge of it.
Mobile Price Classification - Projects Based Learning
Bob has started his own mobile company. He wants to give a tough fight to big companies like Apple, Samsung etc. He does not know how to estimate the price of mobiles his company creates. In this competitive mobile phone market, you cannot simply assume things. To solve this problem he collects sales data of mobile phones of various companies.
- Education (0.40)
- Semiconductors & Electronics (0.36)
Extending Environments To Measure Self-Reflection In Reinforcement Learning
Alexander, Samuel Allen, Castaneda, Michael, Compher, Kevin, Martinez, Oscar
We consider an extended notion of reinforcement learning in which the environment can simulate the agent and base its outputs on the agent's hypothetical behavior. Since good performance usually requires paying attention to whatever things the environment's outputs are based on, we argue that for an agent to achieve on-average good performance across many such extended environments, it is necessary for the agent to self-reflect. Thus, an agent's self-reflection ability can be numerically estimated by running the agent through a battery of extended environments. We are simultaneously releasing an open-source library of extended environments to serve as proof-of-concept of this technique. As the library is first-of-kind, we have avoided the difficult problem of optimizing it. Instead we have chosen environments with interesting properties. Some seem paradoxical, some lead to interesting thought experiments, some are even suggestive of how self-reflection might have evolved in nature. We give examples and introduce a simple transformation which experimentally seems to increase self-reflection.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Spain (0.04)
Causal Analysis of Agent Behavior for AI Safety
Déletang, Grégoire, Grau-Moya, Jordi, Martic, Miljan, Genewein, Tim, McGrath, Tom, Mikulik, Vladimir, Kunesch, Markus, Legg, Shane, Ortega, Pedro A.
As machine learning systems become more powerful they also become increasingly unpredictable and opaque. Yet, finding human-understandable explanations of how they work is essential for their safe deployment. This technical report illustrates a methodology for investigating the causal mechanisms that drive the behaviour of artificial agents. Six use cases are covered, each addressing a typical question an analyst might ask about an agent. In particular, we show that each question cannot be addressed by pure observation alone, but instead requires conducting experiments with systematically chosen manipulations so as to generate the correct causal evidence.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Research Report > Strength High (0.46)
- Research Report > Experimental Study (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Influence-aware Memory for Deep Reinforcement Learning
Suau, Miguel, Congeduti, Elena, Starre, Rolf, Czechowski, Aleksander, Olihoek, Frans
Making the right decisions when some of the state variables are hidden, involves reasoning about all the possible states of the environment. An agent receiving only partial observations needs to infer the true values of these hidden variables based on the history of experiences. Recent deep reinforcement learning methods use recurrent models to keep track of past information. However, these models are sometimes expensive to train and have convergence difficulties, especially when dealing with high dimensional input spaces. Taking inspiration from influence-based abstraction, we show that effective policies can be learned in the presence of uncertainty by only memorizing a small subset of input variables. We also incorporate a mechanism in our network that learns to automatically choose the important pieces of information that need to be remembered. The results indicate that, by forcing the agent's internal memory to focus on the selected regions while treating the rest of the observable variables as Markovian, we can outperform ordinary recurrent architectures in situations where the amount of information that the agent needs to retain represents a small fraction of the entire observation input. The method also reduces training time and obtains better scores than methods that stack multiple observations to remove partial observability in domains where long-term memory is required.
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Recurrent Neural Networks and LSTM – Towards Data Science
Recurrent Neural Networks are the state of the art algorithm for sequential data and among others used by Apples Siri and Googles Voice Search. This is because it is the first algorithm that remembers its input, due to an internal memory, which makes it perfectly suited for Machine Learning problems that involve sequential data. It is one of the algorithms behind the scenes of the amazing achievements of Deep Learning in the past few years. In this post, you will learn the basic concepts of how Recurrent Neural Networks work, what the biggest issues are and how to solve them. Recurrent Neural Networks (RNN) are a powerful and robust type of neural networks and belong to the most promising algorithms out there at the moment because they are the only ones with an internal memory.